Predictive Validity


• Key inference from studies of predictor-criterion 
relationship: the relationship is dependable in the 
specified setting (Messick, 1988, p. 36) 

• Ordinary least squares (OLS) regression is 
commonly used for studies of predictive validity 

• In a wide variety of settings this model is 
adequate and can provide high quality evidence 
to support the validity of a particular use of a test 
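As a minimal sketch of the OLS setup described above, the following fits a criterion on a single predictor and extracts the residuals. The data values and the names `fit_ols`, `residuals`, `hsgpa`, and `fygpa` are hypothetical and chosen for illustration only; a study of this kind would typically use several predictors.

```python
from statistics import mean


def fit_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed criterion minus the value predicted by the fitted line."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical data: high school GPA (predictor) and FYGPA (criterion).
hsgpa = [2.8, 3.0, 3.2, 3.5, 3.7, 3.9, 4.0]
fygpa = [2.1, 2.9, 2.7, 3.2, 3.4, 3.3, 3.8]

intercept, slope = fit_ols(hsgpa, fygpa)
resid = residuals(hsgpa, fygpa, intercept, slope)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

The residuals returned here are the quantities whose distribution (skewness, kurtosis) matters for the normality assumption of the model.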
Criterion Choice in Higher Education


• The validation of undergraduate college 
admissions measures commonly considers first- 
year grade point average (FYGPA) as a criterion 

• FYGPA is meaningful as we expect it to be 
related to: 

a) subsequent college performance; 

b) probability of being retained to the second year; and 

c) probability of graduating within some finite timeframe. 
Some Known Problems with FYGPA


• Students take different courses 

• Smits, Mellenbergh, & Vorst (2002) proposed imputing
individual course grades.

• Courses vary in grading practices 

• Stricker et al. (1994) review adjustments to FYGPA for
different grading standards across departments

• Courses vary in difficulty 

• Young (1990) proposes using IRT-based methods to 
adjust course grades 
Stakeholders are Still Drawn to FYGPA


• Despite the avowed issues, stakeholders still feel 
that FYGPA is a good heuristic for early college 
performance 

• Admissions officers may consider predicted FYGPA in
the admissions process

• This prediction task presents certain problems

• Problem for OLS model: non-normal residuals 

• How can we retain the intuitive appeal of using FYGPA, 
but still overcome the non-normality of residuals? 
Sample & Measures


• Sample 

• cohort of 173,963 students entering 129 four-year institutions
in the fall of 2008 as first-year students

• institutions varied on size, admittance rate, control 

• Measures 

• self-reported high school GPA from SAT Questionnaire 

• SAT critical reading, mathematics, writing 

• institution-supplied FYGPA 
Criterion is Negatively Skewed

[Figure: density plot of FYGPA; vertical axis: Density.]

Note. Student-level density for 173,963 students across 129 institutions.
Residuals: Negative Skew & Positive Kurtosis

[Figure: density plot of the residuals, annotated to mark the negative skewness; horizontal axis: OLS Residual for Untransformed FYGPA; vertical axis: Density.]

Note. Student-level density for 173,963 students across 129 institutions.
Understanding Residual Skewness


• Residuals exhibit negative skew because there 
are fewer, more extreme negative values and 
more, but less extreme positive values 

• In other words, a small number of students performed 
much worse than expected under the proposed model, 
while relatively more students slightly over-performed 

• Could additional — perhaps non-cognitive — measures 
account for that under-performance? 

• Colloquially, consider the awful roommate effect 

• Is there a ceiling effect, since the mean FYGPA is very 
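To make the description of residual shape concrete, here is a minimal sketch of moment-based skewness and excess kurtosis. The function names and the residual values are hypothetical; the example is built so that a few residuals are strongly negative while most are mildly positive, the pattern described above.

```python
from statistics import mean


def central_moment(values, order):
    """Average of (value - mean) raised to the given power."""
    m = mean(values)
    return sum((v - m) ** order for v in values) / len(values)


def skewness(values):
    """Third standardized moment; negative when the left tail is longer."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (zero for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical residuals: two large under-performers, many slight over-performers.
resid = [-2.0, -1.5, 0.1, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.6, 0.7]

print(f"skewness={skewness(resid):.2f}, excess kurtosis={excess_kurtosis(resid):.2f}")
```

These are the simple population-moment versions; statistical packages often apply small-sample corrections, so their values can differ slightly.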
Box-Cox (1964) Transformation


• Box & Cox's (1964) proposed transformation:

$Z_{ij} = \dfrac{Y_{ij}^{\lambda} - 1}{\lambda}$ for $\lambda \neq 0$

where $Y_{ij}$ is FYGPA and $Z_{ij}$ is the transformed criterion


• Properties

• monotonic (i.e., maintains ordering)

• flexible functional form (e.g., useful for positive or negative skew)

• when λ = 1, the transformed values differ from the original
variable only by a constant

• may reduce root mean square error of prediction

• λ is unknown and must either be estimated or be based on prior
research or subject-matter knowledge
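A minimal sketch of the transformation and two of the properties listed above follows. The function name `box_cox` and the sample values are hypothetical. The sketch assumes strictly positive inputs and uses the natural log for λ = 0, which is the standard limiting case in the Box-Cox family.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a single positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("this sketch assumes a strictly positive criterion")
    if lam == 0:
        return math.log(y)  # limiting case of (y**lam - 1) / lam as lam -> 0
    return (y ** lam - 1.0) / lam


# Hypothetical FYGPA values, already sorted in increasing order.
fygpa = [1.5, 2.0, 2.8, 3.3, 3.9]

# Property: with lam = 1 each value is shifted by the same constant (-1).
shifted = [box_cox(y, 1.0) for y in fygpa]

# Property: the transform is monotonic, so ordering is preserved.
squared_scale = [box_cox(y, 2.0) for y in fygpa]

print(shifted)
print(squared_scale)
```

A criterion that can equal zero, as a GPA can, would need separate handling before this transform is applied; that step is outside the sketch.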
Examples of Box-Cox Transformation

[Figure: examples of the Box-Cox transformation; axis label: Criterion.]
Predictions on Original Scale


• One appeal of FYGPA is a typical 0 to 4 scale 

• The scale of the Box-Cox transformation is different 

• Predictions are on that unfamiliar scale 

• Solution : back-transform onto the FYGPA scale 

• $\hat{Y}_{ij} = (\hat{\lambda} \cdot \hat{Z}_{ij} + 1)^{1/\hat{\lambda}}$

• Not all values may be back-transformed 

• If the predicted value is less than -1 and λ is not an
integer, we cannot back-transform
• Consider restricting λ to integer values
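The back-transformation can be sketched as below. The function name `back_transform` is hypothetical. As a conservative assumption of this sketch, any prediction whose base λ̂·Ẑ + 1 is negative is treated as not recoverable, because a fractional power of a negative number is not a real value.

```python
import math


def back_transform(z_hat, lam_hat):
    """Map a prediction on the Box-Cox scale back to the original scale.

    Returns None when the prediction cannot be back-transformed.
    """
    if lam_hat == 0:
        return math.exp(z_hat)  # inverse of the log case
    base = lam_hat * z_hat + 1.0
    if base < 0:
        return None  # assumption of this sketch: flag rather than raise
    return base ** (1.0 / lam_hat)


# Round trip with lam = 2: transform a FYGPA of 3.0, then recover it.
lam = 2.0
z = (3.0 ** lam - 1.0) / lam
print(back_transform(z, lam))

# A prediction far enough below zero on the transformed scale is flagged.
print(back_transform(-1.0, lam))
```

Returning `None` keeps the unrecoverable cases visible so they can be counted or handled separately rather than silently dropped.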



Sample Estimates of Lambda

Statistic      λ-hat
Mean           2.25
SD             0.68
Minimum        0.97
25th Pctile    1.76
Median         2.22
75th Pctile    2.68
Maximum        4.51

• Comments on estimates

• simple mean λ-hat = 2.25

• all λ-hat are positive

• all λ-hat ≥ 1, when rounded

• 50% of sample estimates range from 1.76 to 2.68

• Makes sense, with negative skewness

• λ > 1 expands the scale
Comparison of Residual Skewness

[Figure: density plot of residual skewness; vertical axis: Density.]

Note. Institution-level density for 129 institutions and 173,963 students.
Comparison of Residual Kurtosis

[Figure: density plot of residual kurtosis; horizontal axis: Residual Kurtosis; vertical axis: Density.]

Note. Institution-level density for 129 institutions and 173,963 students.
Comparison of Root Mean Square Error

RMSE

Statistic      OLS     BC
Mean           0.58    0.27
SD             0.16    0.07
Minimum        0.31    0.10
25th Pctile    0.45    0.22
Median         0.55    0.26
75th Pctile    0.69    0.32
Maximum        0.98    0.46

• Reduces RMSE

• In our sample, the RMSE for the Box-Cox model is
about half of what it is for the OLS model

• May use Box-Cox transformation even with
normal residuals

• More precise estimates of FYGPA are possible
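For reference, RMSE can be computed as in the following sketch. The function name `rmse` and all data values are hypothetical. The sketch assumes both sets of predictions are expressed on the same scale as the observed FYGPA; the slides do not state which scale was used for the comparison above.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observed FYGPA and predictions from two competing models.
observed = [2.1, 2.9, 3.4, 3.8]
pred_model_a = [2.6, 2.7, 3.0, 3.3]
pred_model_b = [2.3, 2.8, 3.3, 3.6]

print(f"model A RMSE={rmse(observed, pred_model_a):.3f}")
print(f"model B RMSE={rmse(observed, pred_model_b):.3f}")
```

Comparing RMSE values is only meaningful when both are measured in the same units, which is why back-transformed predictions are a natural basis for this kind of comparison.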
Summary & Limitations


• In this model, FYGPA residuals are not normal 

• Using a Box-Cox transformation on FYGPA 
reduces non-normality 

• Original scale may be recovered for prediction 

• There are limitations of this approach for both 
prediction and inferences around criterion validity: 

• Original scale may not be recoverable for all predictions 

• The need to estimate λ increases Type I error, unless
properly adjusted
Questions, Comments, Suggestions


• Researchers are encouraged to freely express 
their professional judgment. Therefore, points of 
view or opinions stated in College Board 
presentations do not necessarily represent official 
College Board position or policy. 


• Please forward any questions, comments, and 
suggestions to: 
• bpatterson@collegeboard.org


References 


Box, G. E. P., & Cox, D. R. (1964). An analysis of transformations. Journal of the
Royal Statistical Society: Series B (Methodological), 26(2), 211-252.

Messick, S. (1988). The once and future issues of validity: Assessing the
meaning and consequences of measurement. In H. Wainer & H. I. Braun
(Eds.), Test Validity (pp. 33-45). Hillsdale, NJ: Lawrence Erlbaum
Associates, Inc.

Smits, N., Mellenbergh, G. J., & Vorst, H. C. M. (2002). Alternative Missing Data
Techniques to Grade Point Average: Imputing Unavailable Grades. Journal
of Educational Measurement, 39(3), 187-206.

Stricker, L. J., Rock, D. A., Burton, N. W., Muraki, E., & Jirele, T. J. (1994).
Adjusting college grade point average criteria for variations in grading
standards: a comparison of methods. Journal of Applied Psychology, 79(2),
178-183.

Young, J. W. (1990). Adjusting the cumulative GPA using item response theory.
Journal of Educational Measurement, 27(2), 175-186.



